ABSTRACT

The  normal  standard  addition  method  assumes  that,  for 
any  one  analyte  in  a sample  there  is  an  analytical  sensor  which 
responds  to  that  analyte  and  no  other  unknown  in  the  sample. 

When  the  analytical  sensor  is  not  completely  selective,  so- 
called  interference  effects  result  which  can  be  a major  source 
of  error.  The  generalized  standard  addition  method  provides 
a means  of  accounting  for  the  interference  effects,  to  actually 
quantify  the  magnitude  of  the  interferences,  and  simultaneously 
to  determine  the  analyte  concentrations.  The  GSAM  as  presented 
here  uses  multiple  linear  regression  to  analyze  multi-component 
samples  where  the  response-analyte  concentration  relationship 
is  of  some  arbitrary  polynomial  form;  for  a non-linear  polynomial 
relationship,  an  iterative  solution  is  required. 
The standard addition method (SAM) is well known to all
analytical  chemists.  A description  of  the  method  can  be  found 
in  almost  every  text  book  on  any  aspect  of  quantitative  chemical 
analysis  including  the  most  basic  texts  used  in  introductory 
courses. By its use a number of sample- and method-associated
interferences  can  be  overcome;  it  is  particularly  suited  for 
residual  matrix  effects  and  is  most  often  the  method  of  choice 
for  trace  analysis. 

Assuming a linear change in response for an increased
concentration of an analyte, the response is measured before and
after several successive additions of the analyte to a sample
of unknown analyte concentration. Plotting the response (ordinate)
against the amount of standard added (abscissa), the analyte
concentration is found by fitting a line to the data and finding
the intercept on the abscissa.
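
The extrapolation just described can be sketched in a few lines. This is a minimal illustration on synthetic, noise-free data; the function name and the numerical values are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def standard_addition_estimate(added, response):
    """Estimate the initial analyte concentration by linear extrapolation."""
    # Fit response = slope * added + intercept (np.polyfit returns the
    # highest-degree coefficient first).
    slope, intercept = np.polyfit(added, response, 1)
    # The fitted line crosses the abscissa at -intercept / slope; the
    # initial concentration is the magnitude of that intercept.
    return intercept / slope

# Synthetic data (assumed values): response = k * (added + c0).
k_assumed, c0_assumed = 2.5, 0.8
added = np.array([0.0, 1.0, 2.0, 3.0])
response = k_assumed * (added + c0_assumed)

c0_estimate = standard_addition_estimate(added, response)
print(round(c0_estimate, 6))
```

The estimate is only meaningful when the response is linear in concentration and zeroed, which are exactly the restrictions discussed in the text.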

Certain  concepts  which  occur  throughout  this  paper  will 
now  be  defined.  The  "analytical  sensor"  is  that  which  provides 
a measurement  of  one  analytically  valuable  property.  The  analyti- 
cal sensor  is  the  source  of  the  "analytical  signal"  which  may 
undergo  a mathematical  transformation  to  form  the  signal  which 
provides  useful  information  for  sample  analysis.  An  example  of 
a single-instrument, single-sensor analytical method is the
measurement of the voltage between two electrodes in a sample
solution (the pair constitutes a single sensor). An example of a
single-instrument, multiple-sensor analytical method is the
measurement of u.v. absorbance spectra with a u.v. spectrophotometer,
which can provide absorbance values at different wavelengths.

The  instrument  is  a sensor  when  measuring  the  absorbance  at  any 
one  wavelength,  hence  the  u.v.  spectrophotometer  is  a multiple 
sensor  instrument  as  it  can  take  readings  at  different  wavelengths 
and  each  wavelength  can  correspond  to  a single  analyte.  The 
standard  addition  method  assumes  that  for  each  analyte  in  a sample 
there  is  an  associated  analytical  signal  which,  ideally,  is  a 
function only of that analyte and no other unknown sample
component.

By defining the transformed analytical signal obtained from
a sensor by some analytical method for the $\ell$th analyte of a sample
of unknown composition as a "response", $R_\ell$, to the concentration,
$c$, of the analyte ("concentration" is used here, though in some
applications it may not bear any meaning in relation to the amount
of an analyte in a sample), the model implied by the standard
addition method is:

\[ R_\ell = c\,k_\ell = (\Delta c + c_0)\,k_\ell = \Delta c\,k_\ell + c_0\,k_\ell \tag{1} \]

where $\Delta c$ is the known change in concentration, $c_0$ is the unknown
initial concentration of the analyte and $k_\ell$ is the constant
coefficient in the linear relation between property $\ell$ and the
concentration of the analyte. This equation reveals several drawbacks
to the traditional standard addition method: 1) the requirement
that the function relating response to concentration be linear,
2) the requirement that the response be zeroed, i.e., zero
concentration of the analyte should evoke a zero response, and 3) as
a result of 2), if the measured property $\ell$ is affected by
other components than the one of interest, then the effect of
these components must somehow be eliminated from the samples.

Given  a multi-component  mixture  with  several  analytes  of 
interest,  in  order  to  use  the  standard  addition  method  the  ana- 
lyst must  either  A)  use  analytical  methods  which,  in  Kaiser's 
terms  (1),  are  "fully  selective"  so  that  each  response  will  only 
be  affected  by  one  analyte,  allowing  eq.  1 to  be  used,  or  B)  be 
able  to  remove  all  of  the  interfering  components  for  a given 
response  and  a given  analyte  to  allow  the  use  of  eq.  1.  In  a 
large  number  of  cases,  the  requirement  of  full  selectivity  is 
not  obtainable  in  practice,  and  the  isolation  of  each  analyte 
from  all  other  interfering  components  can  present  formidable 
problems.  In  fact,  much  of  the  bulk  of  the  current  analytical 
literature  amounts  to  studies  of  matrix  effects.  There  are 
several  important  specialty  areas  of  analytical  chemistry  (e.g., 
electroanalytical  chemistry,  atomic  emission  spectroscopy,  etc.) 
where  the  relation  between  the  measured  properties  and  the  rela- 
tive amounts  of  various  components  can  be  transformed  into  a 
linear  equation  analogous  to  eq.  1,  but  extended  to  include  con- 
tributions from  several  components  for  each  response  (the  so- 
called  "interference"  effects)  as: 

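\[ R_\ell = \sum_{s=1}^{r} c_s\,k_{s\ell} = \sum_{s=1}^{r} (\Delta c_s + c_{0,s})\,k_{s\ell} = \sum_{s=1}^{r} \Delta c_s\,k_{s\ell} + \sum_{s=1}^{r} c_{0,s}\,k_{s\ell} \tag{2} \]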

where  there  are  r analytes  of  interest. 

In  this  paper,  we  present  the  "Generalized  Standard  Addition 
Method"  for  the  simultaneous  determination  of  any  number  of  ana- 
lytes using  analytical  sensors  that  have  responses  defined  by 
equation  2.  The  only  assumptions  are  that  each  response  can  be 
"zeroed" and that the number of analytical sensors is greater
than  or  equal  to  the  number  of  analytes.  The  analyst  no  longer 
needs  to  be  concerned  with  the  selectivity  or  interferences  of 
analytical  methods. 

In  formulating  the  generalized  standard  addition  method 
(hereafter  referred  to  as  the  GSAM),  this  paper  will  present  the 
mathematics  for  a simultaneous  multidimensional  analysis  by 
multiple  regression  using  multiple  standard  additions  and  ana- 
lytical sensors  for  the  linear  model  of  eq.  2,  and  for  the  exten- 
sion to  the  quadratic,  cubic  and  higher  order  models  (i.e., 
allowing $k_{s\ell}$ to depend on the concentrations of the sample
components in a linear, quadratic or higher-order manner). We include
the method of the determination of the initial concentrations,
and we also show how to recover the coefficients of the model
(e.g. the selectivity coefficients if the $k_{s\ell}$'s are constants)
from  the  regression  coefficients.  The  equation  to  be  iteratively 
solved  for  the  initial  concentrations  for  any  non-linear  model 
is  generalized  to  a model  of  arbitrary  degree.  Some  discussion 
is  included  of  the  construction  of  decision  functions  to  help 
avoid areas of local divergence of the iteration (if any exist),
and  the  practical  considerations  of  applying  the  GSAM  are  dis- 
cussed. 
DEVELOPMENT OF THE METHOD

The simple linear model (eq. 2, with the $k_{s\ell}$'s as constants)
is  developed  first,  in  traditional  vector-matrix  notation,  to 
illustrate  the  motivation  for  the  approach  used.  The  extension 
to  the  quadratic  model,  again  in  traditional  vector-matrix  nota- 
tion, will  then  be  made.  For  the  cubic  model,  new  notation  is 
introduced  which  allows  the  extension  to  a model  of  arbitrary 
degree  to  be  readily  grasped  intuitively.  However,  certain  key 
expressions  are  written  out  in  detail  using  standard  summation 
notation  to  clarify  the  meaning  of  the  more  abstract  notation. 

First  some  background  notation  is  established.  Define 


\[ \vec{R}_m \equiv (R_{m1}, R_{m2}, \ldots, R_{mp}) \tag{3} \]

\[ \vec{C}_m \equiv (c_{m1}, c_{m2}, \ldots, c_{mr}) \tag{4} \]

where $\vec{R}_m$ is the vector of responses 1 through $p$ (i.e., there are
$p$ different analytical sensors) measured after $m$ standard additions
have been made to a sample containing analytes 1 through $r$ in the
concentrations listed in the vector $\vec{C}_m$. From eq. 2,


\[ R_{m\ell} = \sum_{s=1}^{r} c_{ms}\,k_{s\ell}, \qquad \ell = 1, \ldots, p \tag{5} \]


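Writing this in matrix form,

\[ \vec{R}_m = (R_{m1}, R_{m2}, \ldots, R_{mp}) = (c_{m1}, c_{m2}, \ldots, c_{mr}) \begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1p} \\ k_{21} & k_{22} & \cdots & k_{2p} \\ \vdots & & & \vdots \\ k_{r1} & k_{r2} & \cdots & k_{rp} \end{pmatrix} \tag{6} \]

\[ \vec{R}_m = \vec{C}_m\,[K] \tag{7} \]

Stacking these row vectors for $m = 1, \ldots, n$ gives

\[ [R] = \begin{pmatrix} \vec{R}_1 \\ \vec{R}_2 \\ \vdots \\ \vec{R}_n \end{pmatrix}, \qquad [C] = \begin{pmatrix} \vec{C}_1 \\ \vec{C}_2 \\ \vdots \\ \vec{C}_n \end{pmatrix} \tag{8} \]

i.e., $[R]$ and $[C]$ are the matrices of responses and concentrations
for the $n$ successive standard additions. This gives the simple
formulation:

\[ [R] = [C][K] \tag{9} \]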

The  matrix  [R]  is  known,  as  it  is  the  matrix  of  measured  responses. 
The  matrix  [C]  is  unknown,  as  the  concentrations  of  the  analytes 
in  the  sample  during  the  process  of  making  standard  additions  are 
unknown.  The  matrix  of  coefficients,  [K],  is  also  unknown. 

It  should  be  noted  that  additions  can  be  made  for  several 
analytes  at  the  same  time  in  the  process  of  making  a single  "stan- 
dard addition"  subject  to  certain  restrictions  detailed  later. 

The  additions  will  therefore  be  called  "multiple  standard  addi- 
tions", or  MSA's,  to  indicate  the  possibility  of  changing  the 
concentrations  of  several  components  at  the  same  time. 

Some  way  of  expressing  the  relationship  in  eq.  9 in  terms  of 
the  known  changes  in  concentration  is  now  desirable.  It  is 
assumed  that  volume  corrections,  etc.,  can  be  made  so  that  any 
error  from  not  knowing  the  original  concentration  in  calculating 
the  net  change  in  concentration  in  the  sample  will  be  negligible. 
Actually  this  is  rarely  a problem  as  minute  volumes  of  high  concen- 
tration standards  can  be  added  so  that  volume  changes  are  negli- 
gible. 
A separation of terms leads to,

\[ [C] = [\Delta C] + [C_0] \tag{10} \]

where $[C_0]$ is a matrix in which all rows are identically $\vec{C}_0$, the
unknown analyte concentrations, and $[\Delta C]$ is the matrix of the net
changes in concentration made through the MSA's (i.e., $\Delta\vec{C}_m$ is the
total change in concentrations made by the end of the $m$th MSA, and
it is the $m$th row of $[\Delta C]$). Note that in $[C_0]$, each row is the same
as every other row because the GSAM begins with a single sample, to
which successive MSA's are made to generate each row of $[R]$ and $[\Delta C]$.

The ability to separate terms in $c$ into terms in $\Delta c$ and $c_0$
is crucial to what follows. This places a limit on the possible
functional relationships between concentration and response that
the GSAM can accommodate. However, as long as one can find a way
to  transform  the  relationship  so  that  the  aforementioned  criterion 
is  satisfied,  the  GSAM  will  be  applicable  to  the  problem.  This 
will  become  clearer  in  the  derivations  that  follow. 


Linear  Model 

The  simplest  model  is  given  by  assuming  that  the  elements  of 
the  matrix  [K]  are  constants  over  the  range  of  the  experiment; 
i.e.,  that  [K]  is  a constant  matrix.  Then 


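\[ [R] = [\Delta C][K] + [C_0][K] = [\Delta R] + [R_0]; \qquad [R_0] = \begin{pmatrix} R_{01} & R_{02} & \cdots & R_{0p} \\ R_{01} & R_{02} & \cdots & R_{0p} \\ \vdots & & & \vdots \\ R_{01} & R_{02} & \cdots & R_{0p} \end{pmatrix} \tag{11} \]

and

\[ [\Delta R] = [\Delta C][K], \tag{12} \]

\[ [R] - [\Delta C][K] = [C_0][K], \tag{13} \]

\[ [R][K]^{-1} - [\Delta C] = [C_0]. \tag{14} \]
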
This latter step assumes that $[K]^{-1}$ exists, which requires
that the number of analytical signals, $p$, be the same as the number
of analytes, $r$, as $[K]$ is an $r \times p$ matrix. If the number of MSA's
is also restricted to the number of analytes, then assuming $[\Delta R]^{-1}$
and $[\Delta C]^{-1}$ exist, the solution for $[C_0]$ is well defined, as seen in

\[ \left( [R][\Delta R]^{-1} - I \right)[\Delta C] = [C_0] \tag{15} \]

\[ [K] = [\Delta C]^{-1}[\Delta R] \tag{16} \]

Though the problem is well defined when the number of MSA's is
equal to the number of analytes, it is always desirable to use
more data to characterize experimental error, i.e., a least squares
approach:

\[ [\Delta C]^{T}[\Delta R] = [\Delta C]^{T}[\Delta C][K] \tag{17} \]

and assuming $([\Delta C]^{T}[\Delta C])^{-1}$ exists,

\[ ([\Delta C]^{T}[\Delta C])^{-1}[\Delta C]^{T}[\Delta R] = [K] \tag{18} \]

Recalling (14), and remembering $r$ equals $p$ is necessary for $[K]$
to be invertible, $[C_0]$ is found by:

\[ \left( [R]\,([\Delta C]^{T}[\Delta R])^{-1}[\Delta C]^{T} - I \right)[\Delta C] = [C_0]. \tag{19} \]

By checking the results for each row of $[C_0]$, the "goodness of fit"
can be determined, as the rows should be identical. Note that the
matrix $[K]$ can easily be computed once $[\Delta C]^{T}[\Delta C]$ has been inverted.
Hence both the initial concentrations and the model coefficients
are obtained essentially for the price of the inversion of two
matrices. Note that the $[\Delta R]$ matrix must have rank $p$, which means
that none of the analytical signals can be a linear combination of
the others.
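
As a rough numerical check of eqs. 18 and 19, the matrix route can be sketched with NumPy. All numerical values below are assumed for illustration, the data are noise-free, and the initial response `R0` is taken to have been measured before any addition so that the matrix of response changes is known.

```python
import numpy as np

# Assumed "true" quantities, used only to synthesize data (r = p = 2).
K_true = np.array([[1.0, 0.3],
                   [0.2, 0.9]])          # r x p coefficient matrix [K]
c0_true = np.array([0.5, 1.2])           # unknown initial concentrations

# Net concentration changes after each of n = 4 MSA's (rows of [dC]).
dC = np.array([[1.0, 0.0],
               [1.0, 1.0],
               [2.0, 1.0],
               [2.0, 3.0]])
n = dC.shape[0]

R = (dC + c0_true) @ K_true              # eq. 9: [R] = [C][K]
R0 = c0_true @ K_true                    # response before any addition
dR = R - R0                              # [dR] = [R] - [R0]

# Eq. 18: least-squares estimate of [K].
K_est = np.linalg.solve(dC.T @ dC, dC.T @ dR)

# Eq. 19: every row of the result should reproduce the initial concentrations.
C0_rows = (R @ np.linalg.inv(dC.T @ dR) @ dC.T - np.eye(n)) @ dC

print(K_est)
print(C0_rows)
```

With real, noisy data the rows of `C0_rows` will differ slightly, and their spread gives the "goodness of fit" check mentioned above.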
The preceding derivation has been produced here for two reasons:

1)  as  an  introduction  to  what  follows,  and  more  importantly  2)  as 
a way  for  the  analyst  to  have  access  to  the  fundamental  GSAM  if, 
for  some  reason,  a multiple  regression  routine  on  a computer  system 
is  not  available.  The  above  requires  only  the  inversion  of  two 
small  matrices  which  can  be  done  with  a programmable  calculator. 

It is, however, with the use of multiple regression complete
with its associated statistics that the GSAM fully comes into its own
as an analytical technique. Equations 7 and 11 are combined to obtain,

\[ \vec{R}_m = \Delta\vec{C}_m\,[K] + \vec{C}_0\,[K], \qquad m = 1, \ldots, n \tag{20} \]

which can be solved for $[K]$ and $\vec{C}_0$ using a little algebra and a
standard multiple linear regression program found on most computers.
Technically  it  is  a linear  multiple  linear  regression,  the  first 
"linear"  referring  to  the  linear  relationship  of  the  model,  and  the 
second  "linear"  referring  to  the  assumption  by  the  regression  that 
the  independent  variables  (whether  they  are  linear,  square,  cross 
terms,  etc.,  in  some  primary  variable  such  as  change  in  concentra- 
tion here)  are  related  to  the  dependent  variable  by  a multi-linear 
relationship  as  in 

\[ Y = b + a_1 X_1 + a_2 X_2 + a_3 X_3 + \cdots + a_r X_r . \tag{21} \]

Using $\Delta c_1, \Delta c_2, \ldots, \Delta c_r$ as the independent variables for
the linear multiple regression, the coefficients for each independent
variable for each response will be found. The matrix $[K]$
will be just the matrix of these regression coefficients for the
$r$ independent $\Delta c_i$ variables and the $p$ dependent response variables,
and the intercepts from the linear regression will be the entries
of the vector $\vec{C}_0[K]$. If $r$ equals $p$, $[K]^{-1}$ can be found, and since
$(\vec{C}_0[K])[K]^{-1} = \vec{C}_0$, $\vec{C}_0$ can be recovered from the intercepts of the
regression.
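
The regression route of eq. 20 can be sketched as follows. This is an illustrative outline only: the helper name `gsam_linear_regression` and the data are assumptions, an ordinary least-squares solver stands in for a statistical regression package, and the case $r = p$ is assumed so that the intercepts can be converted back to concentrations.

```python
import numpy as np

def gsam_linear_regression(dC, R):
    """Fit R = intercept + dC [K] and recover c0 from the intercepts (r = p)."""
    n = dC.shape[0]
    X = np.column_stack([np.ones(n), dC])      # intercept column + delta-c columns
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    intercepts = coef[0]                       # entries of the vector c0 [K]
    K = coef[1:]                               # r x p regression coefficients
    # c0 [K] = intercepts  <=>  [K]^T c0^T = intercepts^T
    c0 = np.linalg.solve(K.T, intercepts)
    return K, c0

# Assumed synthetic experiment (noise-free).
K_true = np.array([[1.0, 0.3],
                   [0.2, 0.9]])
c0_true = np.array([0.5, 1.2])
dC = np.array([[1.0, 0.0],
               [1.0, 1.0],
               [2.0, 1.0],
               [2.0, 3.0]])
R = (dC + c0_true) @ K_true

K_fit, c0_fit = gsam_linear_regression(dC, R)
print(K_fit)
print(c0_fit)
```

Unlike the matrix route, this form does not need a separate measurement of the initial response, because the intercept is fitted along with the coefficients.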

As seen from eq. 16 and eq. 18 or 19, the analyst should
make certain that the $[\Delta C]$ matrix encompasses a minimum of $r$ (equal
to $p$) independent vectors; this is necessary because of the inversion
of $([\Delta C]^{T}[\Delta C])$ (essentially the same argument holds for the
multiple linear regression), which can be shown to be invertible if
and only if the $n \times r$ matrix $[\Delta C]$ contains $r$ independent vectors
($n \ge r+1$, because there are $r+1$ unknowns: the $r$ coefficients and
the intercept, for each response). This requires the analyst to
choose with some consideration the changes in concentrations he
introduces: the more orthogonal the $\Delta\vec{C}_m$ vectors are to each other,
the more information one effectively obtains. However, in making
up  standards  for  the  MSA's  one  can  make  up  standards  containing 
several  components  of  interest.  By  adding  a known  amount  of  one 
of  these  multicomponent  standards,  one  effectively  moves  along  a 
non-axial  line  in  the  concentration  domain  (the  basis  vectors  for 
the  concentration  domain  being  exactly  as  implied  by  the  notation 
used  above).  In  practice,  all  the  analyst  must  take  care  to  do 
is  to  span  the  concentration  domain  through  the  additions  made. 
Taking the argument to the extreme, a single standard consisting of
a mixture of all components is insufficient by itself, as it spans
only a single dimension.
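
The spanning requirement can be checked before an experiment is run. The sketch below is an assumed helper, not part of the original method description; it simply tests whether the matrix of net concentration changes has full column rank.

```python
import numpy as np

def spans_concentration_domain(dC):
    """True when the n x r matrix of net changes contains r independent vectors."""
    dC = np.atleast_2d(np.asarray(dC, dtype=float))
    return bool(np.linalg.matrix_rank(dC) == dC.shape[1])

# Repeated additions of ONE multicomponent standard (assumed composition):
# every row is a multiple of the same vector, so only one dimension is spanned.
mixture = np.array([1.0, 2.0, 0.5])
single_standard = np.outer([1.0, 2.0, 3.0, 4.0], mixture)

# Additions that change the analytes in independent directions.
mixed_design = np.array([[1.0, 0.0, 0.0],
                         [1.0, 1.0, 0.0],
                         [1.0, 1.0, 1.0],
                         [2.0, 1.0, 1.0]])

print(spans_concentration_domain(single_standard))
print(spans_concentration_domain(mixed_design))
```

A rank test only says whether the design is usable at all; how close the addition vectors are to orthogonal still governs how much information each addition contributes.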

It  should  be  noted  that  the  matrix  [K]  is  found  in  the  regres- 
sion, so  that  the  analyst  automatically  can  see  the  nature  and 
magnitude  of  the  interferences,  and  can  obtain  a measure  of  the 
selectivity  of  his  experimental  design. 
It should be evident to the reader at this point that there
is  no  longer  any  particular  advantage  associated  with  a "fully 
selective"  set  of  analytical  sensors;  in  fact,  it  may  actually 
be  better  to  have  a non-selective  set  in  some  cases,  as  this  will 
allow  the  use  of  information  from  several  sources  at  the  same  time 
for  one  analyte,  thereby  obtaining  more  information  than  one  might 
obtain  from  a single  information  source  that  is  "fully  selective". 

Quadratic  Model 

The assumption that the response coefficients, $k_{s\ell}$, are
constant over the working range of the analytical sensor may not be
warranted. In that case, the linear analysis of the previous
section is insufficient. The obvious next extension is to quadratic
terms by including linear terms in the concentrations of the
components in the response coefficients:

\[ k_{s\ell} = \sum_{i=s}^{r} a_i(s\ell)\,c_i + \gamma_{s\ell}; \qquad \gamma_{s\ell} \text{ is a constant} \tag{22} \]

where $k_{s\ell}$ is the response coefficient-function of the $s$th
component for the $\ell$th response. This model generates the following
equation for the response $\ell$ (hereafter the subscript $m$ is left off,
as the reader should understand that the equations apply at every
step of the GSAM, so there is no need to distinguish the different
MSA's at this point):

\[ R_\ell = \sum_{s=1}^{r} c_s \left( \sum_{i=s}^{r} a_i(s\ell)\,c_i + \gamma_{s\ell} \right) = \sum_{s=1}^{r} \sum_{i=s}^{r} c_s\,a_{is}(\ell)\,c_i + \sum_{s=1}^{r} c_s\,\gamma_{s\ell} \tag{23} \]

where in the last form a subscript is moved to a more
convenient location for what follows. The reason that $i$ ranges
only from $s$ to $r$ in eq. 22 is to remove redundancy in eq. 23 with
regard to cross terms.
Again $c_i$ should be rewritten as $\Delta c_i + c_{0,i}$ to express the
function in terms of the changes in concentration, but because of
the concentration cross terms and square terms in eq. 23, a
separation of all cross terms, square terms, and linear terms in the
$\Delta c_i$'s changes the coefficients. This can be seen in a more
general sense from

\[ [R] = [C][K] = [\Delta C + C_0][\Delta K + K_0] = [\Delta C][\Delta K] + [C_0][\Delta K] + [\Delta C][K_0] + [C_0][K_0] \tag{24} \]

where $[\Delta K]$ is the change in $[K]$ due to change in $[C]$. In the
simple, linear model, $[\Delta K] = [0]$, so the equation reduced to the
form in eq. 11. Now, however, $[\Delta K] \neq [0]$ and the linear terms in
$\Delta c_i$ arise from both $[C_0][\Delta K]$ and $[\Delta C][K_0]$.

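Writing eqs. 22 and 23 to correspond to the form of eq. 24
gives,

\[ k_{s\ell} = \Delta k_{s\ell} + k_{0,s\ell} = \sum_{i=s}^{r} a_i(s\ell)\,\Delta c_i + \sum_{i=s}^{r} a_i(s\ell)\,c_{0,i} + \gamma_{s\ell} \tag{25} \]

\[
\begin{aligned}
R_\ell &= \sum_{s=1}^{r} (\Delta c_s + c_{0,s})(\Delta k_{s\ell} + k_{0,s\ell}) \\
&= \sum_{s=1}^{r} \Delta c_s\,\Delta k_{s\ell} + \sum_{s=1}^{r} c_{0,s}\,\Delta k_{s\ell} + \sum_{s=1}^{r} \Delta c_s\,k_{0,s\ell} + \sum_{s=1}^{r} c_{0,s}\,k_{0,s\ell} \\
&= \sum_{s=1}^{r} \sum_{i=s}^{r} \Delta c_i\,a_{is}(\ell)\,\Delta c_s + \sum_{s=1}^{r} \sum_{i=s}^{r} \Delta c_i\,a_{is}(\ell)\,c_{0,s} + \sum_{s=1}^{r} \Delta c_s\,k_{0,s\ell} + \sum_{s=1}^{r} c_{0,s}\,k_{0,s\ell}
\end{aligned}
\tag{26}
\]
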


($k_{0,s\ell}$ will not be decomposed into its constituent terms for
the moment). Keeping in mind the following relation:

\[ \sum_{s=1}^{r} \sum_{i=s}^{r} \Delta c_i\,a_{is}(\ell)\,c_{0,s} = \sum_{i=1}^{r} \sum_{s=1}^{i} \Delta c_i\,a_{is}(\ell)\,c_{0,s}, \tag{27} \]
the coefficients of the linear terms in eq. 26 can be collected as:

\[ \sum_{i=1}^{r} \Delta c_i \left[ \sum_{s=1}^{i} a_{is}(\ell)\,c_{0,s} + k_{0,i\ell} \right] \tag{28} \]

Equation 26 now condenses to,

\[ R_\ell = \sum_{i=1}^{r} \sum_{s=1}^{i} \Delta c_i\,a_{is}(\ell)\,\Delta c_s + \sum_{i=1}^{r} \Delta c_i \left( \sum_{s=1}^{i} a_{is}(\ell)\,c_{0,s} + k_{0,i\ell} \right) + \sum_{s=1}^{r} c_{0,s}\,k_{0,s\ell} \tag{29} \]

where the last summation term is the constant $R_{0\ell}$, the initial
value of response $\ell$, before any MSA's have been performed.

This is now a suitable model for multiple linear regression.
The variables for the regression are all terms of the form $\Delta c_i \Delta c_j$
($j \le i$) and all linear terms $\Delta c_i$. The regression analysis will
then provide the coefficients of these variables. From eq. 29
it can be seen that the coefficients of the cross terms and square
terms are exactly the model coefficients, $a_{is}(\ell)$. The intercept
for each response, $R_{0\ell}$, is also given by the regression. However,
the coefficients of the linear terms are not simply related
to the original model linear coefficients.
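
To make the choice of regression variables concrete, the sketch below builds the design matrix for a single response and fits it by least squares. The model coefficients, initial concentrations and additions are assumed values used only to generate noise-free data; `quadratic_design` is a hypothetical helper name.

```python
import numpy as np

def quadratic_design(dC):
    """Columns: 1, dc_i for every i, then dc_i * dc_s for every s <= i."""
    n, r = dC.shape
    pairs = [(i, s) for i in range(r) for s in range(i + 1)]
    columns = [np.ones(n)]
    columns += [dC[:, i] for i in range(r)]
    columns += [dC[:, i] * dC[:, s] for i, s in pairs]
    return np.column_stack(columns), pairs

# Assumed model for ONE response: R = c^T A c + gamma . c, A lower triangular.
A = np.array([[0.05, 0.00],
              [0.02, 0.03]])
gamma = np.array([1.0, 0.3])
c0 = np.array([1.0, 2.0])

# Net changes after each of 8 MSA's (6 coefficients must be determined).
dC = np.array([[0.5, 0.0], [0.5, 0.5], [1.0, 0.5], [1.0, 1.5],
               [2.0, 1.5], [2.0, 2.5], [3.0, 2.5], [3.0, 4.0]])
C = dC + c0
R = np.einsum('ni,is,ns->n', C, A, C) + C @ gamma

X, pairs = quadratic_design(dC)
coef, *_ = np.linalg.lstsq(X, R, rcond=None)

r = dC.shape[1]
intercept = coef[0]                 # initial response R0
H = coef[1:1 + r]                   # coefficients of the linear terms
A_fit = np.zeros((r, r))            # coefficients of square and cross terms
for (i, s), value in zip(pairs, coef[1 + r:]):
    A_fit[i, s] = value

print(A_fit)
print(H)
print(intercept)
```

In this synthetic case the square and cross-term coefficients reproduce the assumed model coefficients directly, while the linear coefficients `H` mix the model coefficients with the initial concentrations, as the text states.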

Now let $R_\ell$ be redefined by

\[ R_\ell\{\Delta\vec{C}\} = R_\ell \tag{30} \]

where $R_\ell$ is as expressed in eq. 29 and $\Delta\vec{C}$ is the vector
$(\Delta c_1, \Delta c_2, \ldots, \Delta c_r)$; in this way the functional relationship is
more explicit. The equation for the regression analogous to eq. 29 is
then written in vector-matrix notation as

\[ R_\ell\{\Delta\vec{C}\} = \Delta\vec{C}^{T}\,[A(\ell)]\,\Delta\vec{C} + \Delta\vec{C}^{T}\,[H(\ell)] + R_\ell\{\vec{C}_0\} \tag{31} \]

where $[A(\ell)] = [a_{is}(\ell)]$ (note that $[A(\ell)]$ is a lower triangular
matrix, the matrix of all regression coefficients for the cross
terms and square terms, which are the same as the model
coefficients for these terms, for the $\ell$th response), $[H(\ell)]$ is a
column vector of the regression coefficients of the linear terms in
changes in concentration, and $R_\ell\{\vec{C}_0\}$ is the intercept from the
regression, i.e., the initial response $R_{0\ell}$. The reader should keep
in mind that equation 31 is a function of the net changes in
concentration, since these are the known independent variables.

There is, however, a special case when one can use total
concentration as an argument for the function; this is when
$\Delta\vec{C} = -\vec{C}_0$:

\[
\begin{aligned}
R_\ell\{-\vec{C}_0\} &= (-\vec{C}_0)^{T}[A(\ell)](-\vec{C}_0) + (-\vec{C}_0)^{T}[H(\ell)] + R_\ell\{\vec{C}_0\} \\
&= \vec{C}_0^{T}[A(\ell)]\,\vec{C}_0 - \vec{C}_0^{T}[H(\ell)] + R_\ell\{\vec{C}_0\}
\end{aligned}
\tag{32}
\]

$R_\ell\{-\vec{C}_0\}$ corresponds to the response when all of the initial
concentrations of the analytes have been subtracted from the
sample, and since

\[ [R] = [C][K] = [\Delta C + C_0][K] = [-C_0 + C_0][K] = [0][K] = 0 \tag{33} \]

equation 31 leads, in this special case, to,

\[ 0 = F_\ell(\vec{C}_0) = \vec{C}_0^{T}[A(\ell)]\,\vec{C}_0 - \vec{C}_0^{T}[H(\ell)] + R_\ell\{\vec{C}_0\}, \qquad \ell = 1, \ldots, p \tag{34} \]

The interested reader may actually show that this is the equation
which results when one attempts to solve for $\vec{C}_0$ using eqs. 29 and
30 to relate the model and regression coefficients.

The only unknown in eq. 34 is $\vec{C}_0$; all the rest of the variables
are the regression coefficients for the responses. The problem
of solving for $\vec{C}_0$ then reduces to solving this system of $p$
simultaneous equations. The general problem of solving

\[ \vec{F}(\vec{x}) = \vec{0}; \qquad \vec{F}: \mathbb{R}^{r} \to \mathbb{R}^{p}, \text{ one component of } \vec{F} \text{ being } F_\ell: \mathbb{R}^{r} \to \mathbb{R} \tag{35} \]

is a well known one, and many theorems and techniques exist for
approaching it (2). Note that eq. 34 is a quadratic in $\vec{C}_0$, so an
iterative technique such as a Newton-Raphson procedure is suggested.
The condition of the iteration problem will depend on the
regression coefficients. There are computer programs available in
computer libraries which can solve eq. 35 even without knowing
the analytical form of the Jacobian, which is convenient as the
model becomes complex.
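
A Newton-Raphson iteration for eq. 34 can be sketched as below, for the case $r = p$. The regression coefficients are constructed from an assumed model rather than fitted, the starting guess of zero is an arbitrary choice that happens to work for these mildly non-linear values, and no safeguards against divergence are included.

```python
import numpy as np

def F(x, A, H, R0):
    """Eq. 34 for all responses: x^T A(l) x - H(l) . x + R0(l)."""
    return np.einsum('i,lis,s->l', x, A, x) - H @ x + R0

def jacobian(x, A, H):
    """dF_l/dx_i = ((A(l) + A(l)^T) x)_i - H_i(l)."""
    return np.einsum('lis,s->li', A + A.transpose(0, 2, 1), x) - H

def newton_raphson(A, H, R0, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        step = np.linalg.solve(jacobian(x, A, H), -F(x, A, H, R0))
        x += step
        if np.linalg.norm(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson iteration did not converge")

# Assumed model (r = p = 2): A[l] is lower triangular, gamma is r x p.
A = np.array([[[0.05, 0.00], [0.02, 0.03]],
              [[0.01, 0.00], [0.04, 0.02]]])
gamma = np.array([[1.0, 0.2],
                  [0.3, 0.8]])
c0_true = np.array([1.0, 2.0])

# Regression coefficients implied by the model (eq. 29 and eq. 36).
H = (A + A.transpose(0, 2, 1)) @ c0_true + gamma.T      # shape (p, r)
R0 = np.einsum('i,lis,s->l', c0_true, A, c0_true) + gamma.T @ c0_true

c0_found = newton_raphson(A, H, R0, x0=np.zeros(2))
print(c0_found)
```

Because the system is quadratic it has more than one root, so in practice the starting point matters; the estimate from the linear model is one plausible choice.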

Using equation 29 the original model coefficients can be
recovered from the regression coefficients. This would be of
interest to the analyst concerned not only with knowing $\vec{C}_0$, but
also the values of the model coefficients which can provide a
measure of the selectivity of the responses to the various
components.

For the following, it is assumed that $\vec{C}_0$ is known, either
because a known sample was used to begin with or because eq. 34
was solved for $\vec{C}_0$. From equation 29, the $a_{is}(\ell)$'s are the
regression coefficients of the square and cross terms. The only
coefficients left to recover are the $\gamma_{s\ell}$'s from equation 22. This
can easily be done, using eqs. 29 and 31 to give,

\[
\begin{aligned}
\sum_{s=1}^{i} a_{is}(\ell)\,c_{0,s} + k_{0,i\ell} = H_i(\ell) &= \sum_{s=1}^{i} a_{is}(\ell)\,c_{0,s} + \sum_{j=i}^{r} a_{ji}(\ell)\,c_{0,j} + \gamma_{i\ell} \\
&= \left( \left( [A(\ell)] + [A(\ell)]^{T} \right) \vec{C}_0 \right)_i + \gamma_{i\ell}, \qquad i = 1, \ldots, r
\end{aligned}
\tag{36}
\]

making use of eq. 21, the fact that $H_i(\ell)$ is the regression
coefficient for $\Delta c_i$ for response $\ell$, and the fact that $[A(\ell)]$
is lower triangular. Therefore,

\[ \gamma_{i\ell} = H_i(\ell) - \left( \left( [A(\ell)] + [A(\ell)]^{T} \right) \vec{C}_0 \right)_i \tag{37} \]

and since $\vec{C}_0$ is assumed known, either a priori or by solving
eq. 34, all the terms on the right hand side are known.

Thus  the  GSAM  can  simultaneously  find  the  initial  concentra- 
tions of  all  the  analytes  as  well  as  generate  the  coefficients 
of  eq.  22  to  obtain  the  functional  relation  between  response  and 
concentration  according  to  the  original  model. 
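
Eq. 37 amounts to one matrix-vector product per response. A minimal sketch, with assumed array shapes (`A` of shape p x r x r, `H` of shape p x r) and assumed numerical values:

```python
import numpy as np

def recover_gamma(A, H, c0):
    """Eq. 37: gamma[i, l] = H[l, i] - (([A(l)] + [A(l)]^T) c0)[i]."""
    symmetric = A + A.transpose(0, 2, 1)     # shape (p, r, r)
    return (H - symmetric @ c0).T            # shape (r, p)

# Assumed model, used to build regression coefficients consistent with eq. 36.
A = np.array([[[0.05, 0.00], [0.02, 0.03]],
              [[0.01, 0.00], [0.04, 0.02]]])
gamma_true = np.array([[1.0, 0.2],
                       [0.3, 0.8]])
c0 = np.array([1.0, 2.0])
H = (A + A.transpose(0, 2, 1)) @ c0 + gamma_true.T

gamma_est = recover_gamma(A, H, c0)
print(gamma_est)
```

Only multiplication, addition and subtraction are involved, so any error in the recovered coefficients comes from the regression coefficients and from the estimate of the initial concentrations.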


Cubic Model & Models of Arbitrary Degree

The above results are now extended to the cubic model.
Extensions to an arbitrary order to solve for $\vec{C}_0$ are also formulated;
however, the expansion and decomposition of the coefficients is only
done  for  the  cubic  case;  sufficient  information  is  presented  to 
allow  the  interested  reader  to  perform  this  for  higher  degree 
models  if  necessary.  As  shall  be  discussed  later,  there  are  some 
problems  associated  with  extending  the  models  to  high  degrees  if 
there  is  no  underlying,  theoretical  reason  to  do  so. 

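For the cubic case the equation analogous to eq. 22 becomes,

\[ k_{s\ell} = \sum_{i=s}^{r} \sum_{j=i}^{r} b_{ij}(s\ell)\,c_i\,c_j + \sum_{i=s}^{r} a_i(s\ell)\,c_i + \gamma_{s\ell} \tag{38} \]

and this generates the following equation for response $\ell$:

\[ R_\ell = \sum_{s=1}^{r} c_s\,k_{s\ell} = \sum_{s=1}^{r} \sum_{i=s}^{r} \sum_{j=i}^{r} b_{ij}(s\ell)\,c_s\,c_i\,c_j + \sum_{s=1}^{r} \sum_{i=s}^{r} a_i(s\ell)\,c_s\,c_i + \sum_{s=1}^{r} c_s\,\gamma_{s\ell} \tag{39} \]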
By analogy to earlier results, the regression model for the cubic
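case is:

\[ R_\ell\{\Delta\vec{C}\} = {}^{1}G_\ell + \sum_{s=1}^{r} {}^{2}G_{s\ell}\,\Delta c_s + \sum_{s=1}^{r} \sum_{i=s}^{r} {}^{3}G_{is\ell}\,\Delta c_s\,\Delta c_i + \sum_{s=1}^{r} \sum_{i=s}^{r} \sum_{j=i}^{r} {}^{4}G_{jis\ell}\,\Delta c_s\,\Delta c_i\,\Delta c_j \tag{40} \]
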
where $[{}^{n}G]$ is the $n$-dimensional matrix of regression coefficients
for the $p$ responses for the terms containing $(n-1)$ $\Delta c$'s. For
example, $[{}^{3}G]$ is the three dimensional matrix of coefficients for
the square and cross terms in change in concentration. Note also
that these coefficient matrices are lower triangular in an $(n-1)$
dimensional sense (observe the limits of the indices in eq. 40
and recall that $\ell$ runs from 1 through $p$).

Using the same argument as in the quadratic model, the zero
of the following function must be found

\[ F_\ell(\vec{x}) = {}^{1}G_\ell - \sum_{s=1}^{r} {}^{2}G_{s\ell}\,x_s + \sum_{s=1}^{r} \sum_{i=s}^{r} {}^{3}G_{is\ell}\,x_s\,x_i - \sum_{s=1}^{r} \sum_{i=s}^{r} \sum_{j=i}^{r} {}^{4}G_{jis\ell}\,x_s\,x_i\,x_j \tag{41} \]

since $\vec{F}(\vec{x}) = \vec{0}$ implies that $\vec{x} = \vec{C}_0$. This can be extended to a
regression model of arbitrary degree $n$:


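\[ F_\ell(\vec{x}) = {}^{1}G_\ell - \sum_{i_1=1}^{r} {}^{2}G_{i_1\ell}\,x_{i_1} + \sum_{i_1=1}^{r} \sum_{i_2=i_1}^{r} {}^{3}G_{i_2 i_1 \ell}\,x_{i_1} x_{i_2} - \cdots + (-1)^{n} \sum_{i_1=1}^{r} \sum_{i_2=i_1}^{r} \cdots \sum_{i_n=i_{n-1}}^{r} {}^{n+1}G_{i_n \cdots i_1 \ell}\,x_{i_1} x_{i_2} \cdots x_{i_n} \tag{43} \]
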

and,  as  before,  the  solution  to  F(x)  = 0 must  be  determined. 

The relationship between the regression coefficients and
the model coefficients for the cubic model is determined with
the introduction of a notational convenience: because
$R_\ell = \sum_{s=1}^{r} c_s\,k_{s\ell}$,
an effective dot product in the index $s$, an operator symbol
"$\langle\ \rangle *$" is used to represent this operation (the standard products
of matrices and vectors can be used with only two indices, as in
the quadratic model, but now three indices must be dealt with).

In other words, define

\[ R_\ell = \sum_{s=1}^{r} c_s\,k_{s\ell} \equiv \langle c_s \rangle * k_{s\ell} \tag{44} \]

\[ \langle c_s, c_i \rangle * x_{si} = \sum_{s=1}^{r} \sum_{i=1}^{r} x_{si}\,c_s\,c_i \tag{45} \]


where  it  is  understood  that  the  product  is  over  the  index  attached 
to  the  variable  inside  the  brackets  for  that  variable's  operation. 
Equation  39  can  now  be  written  as, 

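\[ R_\ell = \langle c_s \rangle * k_{s\ell} = \langle c_s, c_i, c_j \rangle * b_{ij}(s\ell) + \langle c_s, c_i \rangle * a_i(s\ell) + \langle c_s \rangle * \gamma_{s\ell} \tag{46} \]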

where  several  terms  have  been  grouped  into  the  brackets,  subject 
to  the  following  properties: 


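Commutative:

\[ \langle c_s, c_i \rangle * x_{si} = \langle c_i, c_s \rangle * x_{si} \tag{47} \]

Associative:

\[ \langle c_s \rangle * \left( \langle c_i, c_j \rangle * x_{sij} \right) = \langle c_s, c_i \rangle * \left( \langle c_j \rangle * x_{sij} \right) \tag{48} \]

Distributive:

\[ \langle c_s \rangle * \left( \left( \langle c_i \rangle + \langle c_j \rangle \right) * x_{si} \right) = \left( \langle c_s, c_i \rangle + \langle c_s, c_j \rangle \right) * x_{si} \tag{49} \]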


All  of  the  above  properties  follow  from  the  definition  of  the 
operator  in  eq.  44  and  eq.  45,  and  they  can  easily  be  verified  by 
the  reader. 

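Proceeding as before, separating $c$ into $\Delta c + c_0$, the
expression for $R_\ell$ becomes,

\[
\begin{aligned}
R_\ell ={}& \langle \Delta c_s + c_{0,s},\, \Delta c_i + c_{0,i},\, \Delta c_j + c_{0,j} \rangle * b_{ij}(s\ell) + \langle \Delta c_s + c_{0,s},\, \Delta c_i + c_{0,i} \rangle * a_i(s\ell) + \langle \Delta c_s + c_{0,s} \rangle * \gamma_{s\ell} \\
={}& \big( \langle \Delta c_s, \Delta c_i, \Delta c_j \rangle + \langle \Delta c_s, \Delta c_i, c_{0,j} \rangle + \langle \Delta c_s, c_{0,i}, \Delta c_j \rangle + \langle c_{0,s}, \Delta c_i, \Delta c_j \rangle \\
&\;\; + \langle \Delta c_s, c_{0,i}, c_{0,j} \rangle + \langle c_{0,s}, \Delta c_i, c_{0,j} \rangle + \langle c_{0,s}, c_{0,i}, \Delta c_j \rangle + \langle c_{0,s}, c_{0,i}, c_{0,j} \rangle \big) * b_{ij}(s\ell) \\
&+ \big( \langle \Delta c_s, \Delta c_i \rangle + \langle c_{0,s}, \Delta c_i \rangle + \langle \Delta c_s, c_{0,i} \rangle + \langle c_{0,s}, c_{0,i} \rangle \big) * a_i(s\ell) \\
&+ \big( \langle \Delta c_s \rangle + \langle c_{0,s} \rangle \big) * \gamma_{s\ell}
\end{aligned}
\tag{50}
\]
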

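By collecting terms of similar power in $\Delta c$'s, and consequently
renaming and reordering subscripts (actually, some order was
preserved for clarity, being otherwise unnecessary due to the
commutativity of eq. 47), equation 50 becomes,

\[
\begin{aligned}
R_\ell ={}& \langle \Delta c_{j_1}, \Delta c_{j_2}, \Delta c_{j_3} \rangle * b_{j_2 j_3}(j_1\ell) \\
&+ \langle \Delta c_{j_1}, \Delta c_{j_2} \rangle * \Big( \langle c_{0,j_3} \rangle * \big( b_{j_2 j_3}(j_1\ell) + b_{j_3 j_2}(j_1\ell) + b_{j_1 j_2}(j_3\ell) \big) + a_{j_2}(j_1\ell) \Big) \\
&+ \langle \Delta c_{j_1} \rangle * \Big( \langle c_{0,j_2}, c_{0,j_3} \rangle * \big( b_{j_2 j_3}(j_1\ell) + b_{j_1 j_2}(j_3\ell) + b_{j_2 j_1}(j_3\ell) \big) + \langle c_{0,j_2} \rangle * \big( a_{j_1}(j_2\ell) + a_{j_2}(j_1\ell) \big) + \gamma_{j_1\ell} \Big) \\
&+ \langle c_{0,j_1} \rangle * \Big( \langle c_{0,j_2}, c_{0,j_3} \rangle * b_{j_2 j_3}(j_1\ell) + \langle c_{0,j_2} \rangle * a_{j_2}(j_1\ell) + \gamma_{j_1\ell} \Big)
\end{aligned}
\tag{51}
\]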
The reader will note the essential symmetry underlying the
permutations of the subscripts in eq. 51; this is expected from the
symmetry  in  eq.  50. 
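
The collection of terms leading to eq. 51 can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: it evaluates the cubic model for one sensor directly at $c = \Delta c + c_0$ and again from the terms grouped by power of $\Delta c$, and compares the two. The array layout (`b[s][i][j]` for $b_{ij}(s\ell)$, `a[s][i]` for $a_i(s\ell)$, `g[s]` for $\gamma_{s\ell}$) and all numerical values are assumptions chosen only for illustration.

```python
from itertools import product

r = 2                      # number of analytes (hypothetical)
idx = range(r)

# Hypothetical coefficients for a single sensor l:
# b[s][i][j] = b_ij(sl), a[s][i] = a_i(sl), g[s] = gamma_sl
b = [[[0.10, -0.20], [0.05, 0.30]],
     [[0.07, 0.01], [-0.04, 0.02]]]
a = [[1.5, -0.3], [0.4, 0.9]]
g = [2.0, 0.7]
c0 = [0.8, 1.3]            # initial concentrations
dc = [0.25, -0.10]         # concentration changes

def full_model(c):
    """Cubic model evaluated directly at the total concentrations c."""
    cubic = sum(c[s] * c[i] * c[j] * b[s][i][j]
                for s, i, j in product(idx, repeat=3))
    quad = sum(c[s] * c[i] * a[s][i] for s, i in product(idx, repeat=2))
    lin = sum(c[s] * g[s] for s in idx)
    return cubic + quad + lin

def collected(dc, c0):
    """Same response, with terms grouped by power of dc as in eq. 51."""
    total = 0.0
    for j1, j2, j3 in product(idx, repeat=3):
        # third-order term in dc
        total += dc[j1] * dc[j2] * dc[j3] * b[j1][j2][j3]
        # second-order terms in dc arising from the cubic coefficients
        total += dc[j1] * dc[j2] * c0[j3] * (
            b[j1][j2][j3] + b[j1][j3][j2] + b[j3][j1][j2])
        # first-order terms in dc arising from the cubic coefficients
        total += dc[j1] * c0[j2] * c0[j3] * (
            b[j1][j2][j3] + b[j3][j1][j2] + b[j3][j2][j1])
        # constant term from the cubic coefficients
        total += c0[j1] * c0[j2] * c0[j3] * b[j1][j2][j3]
    for j1, j2 in product(idx, repeat=2):
        total += dc[j1] * dc[j2] * a[j1][j2]
        total += dc[j1] * c0[j2] * (a[j2][j1] + a[j1][j2])
        total += c0[j1] * c0[j2] * a[j1][j2]
    for j1 in idx:
        total += (dc[j1] + c0[j1]) * g[j1]
    return total

direct = full_model([d + c for d, c in zip(dc, c0)])
grouped = collected(dc, c0)
```

The two values agree to rounding error for any choice of coefficients, since the grouping is only a relabelling of summation indices.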

The relation between the regression coefficients of eq. 40
and the model coefficients of eq. 39 can now be seen. Once a
solution for $c_0$ has been found, then, beginning by identifying
the matrix [B] (with elements $b_{ij}(s\ell)$) with the matrix $[{}_4G]$, the
matrix [A] (with elements $a_i(s\ell)$) can be obtained using the
coefficients of $[{}_3G]$, and then the matrix $[\gamma]$ (with elements
$\gamma_{s\ell}$) is found using the coefficients of $[{}_2G]$. It is noteworthy
that the original model coefficients can be recovered with only the
simple operations of vector and matrix multiplication, addition and
subtraction; there is no matrix inversion involved.

Using  the  notation  introduced  above,  it  should  be  apparent 
to  the  reader  how  to  express  the  regression  coefficients  in  terms 
of  the  model  coefficients  for  a model  of  any  degree.  As  an  aid 
to  understanding,  the  reader  can  go  back  and  solve  the  equations 
for the quadratic case using this new notation, checking the
results against those obtained above.

Rewriting  eq.  43  for  clarity  gives 


as the general function to solve for F(x) = [0] to find $c_0$.
However, as is usually the case for higher order models, the number
of coefficients in the model grows faster than the degree of the
model, despite the fact that the matrices of coefficients are
"lower triangular" in an (n-1)-dimensional sense. For regression
analysis to succeed, at least one measurement must be made for each
coefficient to be determined. Hence, as the degree of the model
rises, the number of measurements that must be made increases
rapidly. The GSAM also becomes more demanding as the number of
analytes under study grows, as shown in Table I. In addition, in a
multiple linear regression analysis the covariance of the variables
must also be considered. If some variables are highly
intercorrelated, the regression problem becomes ill-conditioned, and
it may not be possible to obtain a solution. Because the above model
takes into account all possible terms, such a problem might arise
if the degree of the model is too large for a given set of
measurements.

It should be noted, however, that not all of the terms in eq. 42
need be included in the regression; it may prove more fruitful for
the analyst to drop certain terms from the model by not including
the corresponding variable in the list of variables for the
regression. This effectively forces the coefficients of that
particular variable to be zero in the model of eq. 42. For example,
suppose it was decided to include only the cross terms and not the
square terms of the quadratic model. Then the variables used for the
multiple linear regression would be the $\Delta c_i$'s and the
$\Delta c_i \cdot \Delta c_j$'s for $j<i$. The analyst should remember,
however, that there is an important difference between forcing a
regression coefficient to zero by not including the corresponding
variable, and calculating a zero coefficient from the regression
when the variable is included.

Several excellent methods can be used to select a specified
number of variables from among a larger list of variables in order
to obtain a good fit to the measurements. Stepwise regression is
one example of a forward selection method, and a number of these
methods are available in computer program libraries. By specifying
the maximum size of the set the analyst will allow, the number of
regression coefficients the model has at any one time can be
controlled, and the possibility of two highly correlated variables
being used in the model at the same time is reduced. This in turn
allows models of high degree to be considered without requiring an
excessive number of data points.
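
The idea of dropping terms by leaving the corresponding variable out of the regression can be illustrated with a short Python sketch. It builds the regression variables for a two-analyte quadratic model with the square terms omitted (intercept, $\Delta c_1$, $\Delta c_2$, $\Delta c_1 \Delta c_2$) and fits them by ordinary least squares. The function names, the additions and the coefficient values are hypothetical, and the small normal-equations solver merely stands in for whatever regression routine the analyst has available.

```python
from itertools import combinations

def design_row(delta_c, include_squares=False):
    """Regression variables for one multiple standard addition (MSA)."""
    row = [1.0]                                          # intercept
    row += [float(x) for x in delta_c]                   # linear terms
    row += [x * y for x, y in combinations(delta_c, 2)]  # cross terms
    if include_squares:
        row += [x * x for x in delta_c]                  # square terms
    return row

def least_squares(X, y):
    """Solve the normal equations (X^T X) beta = X^T y by Gauss-Jordan."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for k in range(n):
            if k != col:
                f = A[k][col]
                A[k] = [vk - f * vc for vk, vc in zip(A[k], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical additions of two analytes and an exact, noise-free response
# that contains a cross term but no square terms.
additions = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (2, 2)]
true_coeffs = [3.0, 2.0, 0.5, 0.25]   # intercept, dc1, dc2, dc1*dc2
X = [design_row(dc) for dc in additions]
y = [sum(t * v for t, v in zip(true_coeffs, row)) for row in X]

fitted = least_squares(X, y)
```

Calling `design_row(dc, include_squares=True)` instead adds the two square terms as extra columns, which raises the minimum number of measurements per sensor from four to six.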
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TABLE  I 


Columns: Model; Formula; Minimum No. of Data Points.
Surviving entry: p = r = 2: 10.
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The  GSAM  can  be  generalized  even  further  than  has  been  done. 

The restriction is that terms in $c$ be separable into terms in $\Delta c$
and $c_0$. The word "separation" is deliberately vague, for the
nature of this separation will depend on several factors. If

$$R\{c\} = R\{\Delta c + c_0\} = \Phi\{\Delta c\} \qquad (53)$$

where the terms in $c_0$ are now mixed in with other coefficients,
then, using the same argument as before, in general

$$\Phi\{-(x)\} = 0 \qquad (54)$$

must be solved, as the solution is $x = C_0$. In principle it is
possible to develop routines for multiple regression on functions
other than linear sums of independent variables. If, in a
particular problem, the theoretical relation can be transformed to
one which is suitable for a multiple regression, and if the terms
in $C_0$ will combine with other coefficients to make up the
regression coefficients when the relation is expressed as a function
of the changes in concentrations, then the fitted regression curve
can be used to solve for $C_0$ in eq. 54 (perhaps iteratively, perhaps
by some other means, depending on the nature of the function in
the regression), and possibly to recover the original model
coefficients. The severity of the required separation between terms
in $\Delta c$ and $c_0$ thus depends on the type of function used for a
model of the relationship between response and concentration and on
the type of function used for multiple regression. As an example,
if the only available technique is multiple linear regression,
then the relation $y = \ln(ac) = \ln(a\Delta c + a c_0)$ does not have a
sufficient separation between terms in $\Delta c$ and $c_0$, i.e., the
normal equations using $\ln(\Delta c)$ as the independent variable will
be nonlinear. Transforming the relation by exponentiation solves
this problem, and the separation is then sufficient for a
multiple linear regression.
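
The exponentiation example can be made concrete with a minimal Python sketch for a single analyte. After the transformation, $e^y = a\Delta c + a c_0$ is linear in $\Delta c$, so a straight-line fit gives slope $a$ and intercept $a c_0$, and $c_0$ follows as their ratio. The values of `a_true` and `c0_true`, the additions, and the helper `fit_line` are assumptions made purely for illustration.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

a_true, c0_true = 2.0, 0.4                 # hypothetical values
delta_c = [0.0, 0.1, 0.2, 0.4, 0.8]        # added concentrations

# Logarithmic response: y = ln(a*(dc + c0)); not separable as it stands.
y = [math.log(a_true * (dc + c0_true)) for dc in delta_c]

# Exponentiation gives z = a*dc + a*c0, which is linear in dc.
z = [math.exp(yi) for yi in y]
slope, intercept = fit_line(delta_c, z)

c0_est = intercept / slope                 # (a*c0) / a
```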

In this paper attention has been restricted to polynomial
forms, which allow multiple linear regression to be used. However,
the basic concepts behind the GSAM can still be put to
work for quantitative chemical analysis and the understanding of
systems of analytical sensors with models having other than
simple polynomial forms.


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CONSIDERATIONS  IN  THE  APPLICATION  OF  THE  GSAM 
Error  Analysis 

1) Random Error: Along with each coefficient, multiple linear
regression analysis also provides an estimate of the variance in
that coefficient, which can be propagated to the model coefficients
or initial concentrations (see Larson et al. for this in the
standard addition method (3)). In many cases, particularly for
the iterative solution required for the quadratic and higher order
models, this can become very tedious. Another estimate of the
error can be made based on the fact that upper and lower confidence
bounds can be obtained for each regression coefficient.

Given confidence bounds of, say, 90% or 95%, a measure of the
error in a regression coefficient can be taken as the following:

$$\mathrm{err}(x) = \frac{|\mathrm{U.C.B.}(x) - \mathrm{L.C.B.}(x)|}{2} \qquad (55)$$

or $\mathrm{err}(x) = \max\{\mathrm{U.C.B.}(x) - x,\ x - \mathrm{L.C.B.}(x)\}$

where "U.C.B." and "L.C.B." are the upper and lower confidence
bounds. An error estimate for the solution of eq. 52 for $C_0$
would appear most readily obtainable by perturbing the regression
coefficients within the confidence bounds and performing the
iteration again to solve for a new $C_0$. The error in the initial
concentration can then be estimated by

$$\mathrm{err}({}_0c_i) = \max_{k=1,\ldots,t} \left| {}^k_0c_i - {}_0c_i \right| \qquad (56)$$

where ${}^k_0c_i$ corresponds to the kth perturbed iterative solution for
the initial concentration of component i. If t, the total number
of perturbations carried through, is large enough (which may not
be practical if the solution does not converge quickly), a better
estimate might be


Using 95% confidence bounds can be expected to provide a
conservative estimate of the precision of the solution.
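
The perturbation estimate of eq. 56 can be sketched in Python for the simplest nonlinear case: one analyte, one sensor, and a quadratic regression $R = g_0 + g_1\Delta c + g_2\Delta c^2$. The sketch assumes a zeroed response, so that the initial concentration is the root of $g_0 - g_1 x + g_2 x^2 = 0$ (the single-analyte form of eq. 54). The fitted coefficients, the confidence half-widths, and the choice of perturbing every coefficient to its upper or lower bound are all hypothetical.

```python
from itertools import product

def solve_c0(g, x0=0.0, tol=1e-12, max_iter=50):
    """Newton iteration for g0 - g1*x + g2*x**2 = 0, started at x0."""
    g0, g1, g2 = g
    x = x0
    for _ in range(max_iter):
        f = g0 - g1 * x + g2 * x * x
        df = -g1 + 2.0 * g2 * x
        step = f / df
        x -= step
        if abs(step) < tol:
            break
    return x

# Hypothetical regression coefficients (intercept, linear, quadratic).
# They correspond to R = 2c + 0.1c^2 with an initial concentration of 1.
g_fit = (2.1, 2.2, 0.1)
# Hypothetical half-widths of the confidence intervals on each coefficient.
half_width = (0.05, 0.04, 0.01)

c0 = solve_c0(g_fit)

# Perturb every coefficient to its lower or upper bound (t = 8 combinations)
# and repeat the iteration for each perturbed set.
perturbed = []
for signs in product((-1.0, 1.0), repeat=3):
    g_k = tuple(g + s * h for g, s, h in zip(g_fit, signs, half_width))
    perturbed.append(solve_c0(g_k))

# Eq. 56: largest deviation of a perturbed solution from the unperturbed one.
err_c0 = max(abs(ck - c0) for ck in perturbed)
```

For several analytes the same loop applies, with `solve_c0` replaced by the multidimensional iteration; the number of bound combinations then grows quickly, which is the practical limit on t noted above.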

It is sometimes desired to weight certain measurements more
highly than others (perhaps because, owing to the experimental
design, errors are smaller in a certain range of concentrations),
in which case a weighted multiple linear regression could be
envisaged. The introduction of weights, however, greatly complicates
the statistics of the regression.

2) Systematic Error: In reality, the response function may
not be zeroed when the concentrations are zero. In this case the
response function is better represented by

$$[R] = [C][K] + [\delta R] = [\Delta C][K] + [{}_0C][K] + [\delta R] \qquad (58)$$

The effective change is that

$$[{}_1G] = [{}_0C][K] + [\delta R] \qquad (59)$$

is the regression intercept. Comparison with eq. 51 shows that

$$F(x) = -[\delta R] \qquad (60)$$

is actually the equation that must be solved. The problem of
iteratively solving a perturbed system instead of the ideal
system, and the effect the perturbation has on the solution, is a
well known one, though not so well understood. The specific
effects of the perturbation will depend on the application and
the nature of the function F(x) that is produced.

If $\delta R$ is small compared to values of R, and the problem is
reasonably insensitive to perturbations (for example, a problem
with large gradients that converges quickly), then one can expect
the effect on the final solution to be small. By using a "blank"
to obtain a $[\rho_0]$ (effectively the background signal), and using

$$[R'] = [R] - [\rho_0] \qquad (61)$$

as the response function, [R'] will usually be expected to follow
eq. 58 with $[\delta R]$ being small (an exception to this is considered
below). The problem of zeroing the response function is simply
the multidimensional generalization of the same problem with the
standard addition method.

Notes  on  the  Iterative  Solution  of  Equation  35 

Most methods for solving eq. 35 by iteration require an
initial approximation, a "guess", as to the final solution in order
to start the iteration. The speed of convergence will often
depend on the accuracy of the estimate; if several zeros exist,
convergence to the "correct" zero (the zero corresponding to the
initial analyte concentrations) will also depend on the initial
approximation. It would be possible, for example, for the
iteration to converge to a solution with large negative concentration
values in some problem; this is obviously a solution that should be
rejected, but it is nevertheless a potential solution which could
be found by the iteration.
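
The dependence on the starting guess can be seen in a small Python sketch. It applies a plain Newton iteration from several starting points to a hypothetical single-analyte function with two zeros, one of them a large negative "concentration", and keeps only the nonnegative result. The function `f`, its coefficients and the list of starting points are assumptions for illustration; Newton's method is used here simply as a familiar example of an iterative solver.

```python
def newton(f, df, x0, tol=1e-12, max_iter=100):
    """Plain Newton iteration; returns None if it does not converge."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0.0:
            return None
        step = f(x) / slope
        x -= step
        if abs(step) < tol:
            return x
    return None

# Hypothetical F(x) with zeros at x = 1 and x = -19.
def f(x):
    return 1.9 - 1.8 * x - 0.1 * x * x

def df(x):
    return -1.8 - 0.2 * x

starts = [-30.0, -5.0, 0.1, 5.0]
found = [newton(f, df, s) for s in starts]

# Distinct zeros reached from the different starting points.
roots = {round(x, 6) for x in found if x is not None}

# Reject physically meaningless (negative) concentrations.
acceptable = sorted(x for x in roots if x >= 0.0)
```

Here the starting point -30 leads to the zero at -19, while the other three starting points lead to the zero at 1, so only one candidate survives the nonnegativity check.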

The natures of local and global convergence have been studied
for iterative solutions to eq. 35 and some elegant theorems exist,
which are beyond the scope of this paper. The analyst using the
GSAM for a non-linear model should be acquainted with the theorems
and their conditions, in order to recognize whether the nature of
the function in eq. 51 is such that certain theorems (Ortega and
Rheinboldt (2)) concerning convergence and divergence (local and
global) apply.

An estimate of an interval in which to start the iteration
can be obtained as follows. To obtain the best values for $C_0$,
the analyst should attempt to collect most of the data (obtained
by the MSA's) near the starting response values $R_\ell\{0\}$,
$\ell = 1,\ldots,p$, as this will make the regression model fit the
"true" function nearest the point of interest, namely $C_0$. This
means that, for a sample of components of completely unknown
concentrations, the analyst must begin with low concentration MSA's
which will result in small changes in concentration of each analyte
in the sample until a change in the response from one sensor is
achieved. At this point, the unknown initial concentration must be
within a few orders of magnitude of the known added concentration,
depending on the size of the response change that can be detected.
(In this method for establishing a starting interval, it is assumed
that the concentration of only one component at a time is being
changed, as shall be made clear in the following discussion.) This
motivates the definition


$$\underline{\Delta c_i} = \min\left\{ \Delta c_i \;\middle|\; \alpha \le \max_{\ell=1,\ldots,p} \frac{|R_\ell\{\Delta c_i\} - R_\ell\{0\}|}{|R_\ell\{0\}|} \right\}$$

$$\overline{\Delta c_i} = \max\left\{ \Delta c_i \;\middle|\; \max_{\ell=1,\ldots,p} \frac{|R_\ell\{\Delta c_i\} - R_\ell\{0\}|}{|R_\ell\{0\}|} \le \beta \right\}$$

$\underline{\Delta c_i}$ is then the smallest added concentration of analyte i which
produces a relative response change in at least one analytical
sensor greater than or equal to ($\alpha \times 100$)%. $\overline{\Delta c_i}$ is the largest
added concentration of analyte i which produced a relative
response change no greater than ($\beta \times 100$)% for any sensor. The interval
in which to start the iteration is then $[\underline{\Delta c_i}, \overline{\Delta c_i}]$; $\alpha$ and $\beta$ are
the parameters which define the width of the interval. For example,
if $\alpha = 0.5$ and $\beta = 10$ then $\underline{\Delta c_i}$ is the added concentration which gives
at least a 50% response change in at least one analytical sensor,
and $\overline{\Delta c_i}$ is the largest added concentration for which no sensor
changes by more than 1000%. In both cases, one looks at
the response most sensitive to changes in concentration for
analyte i (most sensitive in the sense of largest relative gradient).

When $\alpha = \beta = 1$, the change in analyte i required to double the
response in the most sensitive sensor for analyte i is found. If
a linear model is assumed, then $\underline{\Delta c_i} = \overline{\Delta c_i}$ would be the same size
as ${}_0c_i$, provided that the most sensitive sensor for component i
in this concentration range is not affected by the presence of
components other than i nearly as much as it is by i. A "fully
selective" set of sensors would have this property no matter what
the concentrations of the components. This can be seen in eq. 66
given the condition of eq. 65.

$$R_\ell\{c\} = k_\ell c_\ell = k_\ell \Delta c_\ell + k_\ell\,{}_0c_\ell = k_\ell \Delta c_\ell + R_\ell\{0\}, \qquad \ell = 1,\ldots,p=r \qquad (65)$$

$$R_\ell\{\Delta c\} = 2R_\ell\{0\} \;\Rightarrow\; k_\ell \Delta c_\ell = R_\ell\{0\} \;\Rightarrow\; \Delta c_\ell = {}_0c_\ell, \qquad \ell = 1,\ldots,p=r \qquad (66)$$

In any case, for most reasonable models, $\alpha = 0.1$ and $\beta = 10$
provide a broad enough interval so that, of several starting points
in the interval, at least one should be close enough to $C_0$ to allow
convergence. The parameters $\alpha$ and $\beta$ allow moderate interference
effects to be taken into account. Under extenuating conditions
(e.g., large interference effects on all responses for analyte i),
the parameters $\alpha$ and $\beta$ would have to be adjusted by the analyst,
unless the regression coefficients provide good convergence
qualities to the iteration (i.e., if global convergence, or a very
large region of rapid convergence, exists, then the analyst can take
any reasonable starting guess).

It should be noted that the above technique has assumed that
large relative changes in the concentrations of interfering
components in the sample have not occurred, as these could interfere
with the responses to changes in analyte i's concentration. Instead
of using eqs. 65 and 66, which require using a fresh amount of the
unknown sample for each component to determine the starting point
for the iteration, the following, slightly less accurate method
is suggested:

1) To a given amount of the unknown sample make successive changes
in the concentration of analyte 1 only, until the criterion for
$\overline{\Delta c_1}$ has been surpassed.

2) Then make successive changes in the concentration of analyte 2
only, until the criterion for $\overline{\Delta c_2}$ has been surpassed, using not
$R_{\ell_2}\{0\}$ but ${}^1R_{\ell_2}\{0\}$ (the final value of response $\ell_2$ after the
additions of step 1, where $\ell_2$ is that response which is most
sensitive to analyte 2), in order to subtract out the response due
to the presence of components before the additions of analyte 2.

3) Repeat for analyte 3, using ${}^2R_{\ell_3}\{0\}$, defined similarly to
${}^1R_{\ell_2}\{0\}$, in the criterion for $\overline{\Delta c_3}$, and continue with the rest
of the analytes.

Eqs. 65 and 66 would be altered to appear as

$$\underline{\Delta c_i} = \min\left\{ \Delta c_i \;\middle|\; \alpha_i \le \max_{\ell=1,\ldots,p} \frac{|R_\ell\{\Delta c\} - {}^{(i-1)}R_\ell\{0\}|}{|{}^{(i-1)}R_\ell\{0\}|} \right\} \qquad (67)$$

$$\overline{\Delta c_i} = \max\left\{ \Delta c_i \;\middle|\; \max_{\ell=1,\ldots,p} \frac{|R_\ell\{\Delta c\} - {}^{(i-1)}R_\ell\{0\}|}{|{}^{(i-1)}R_\ell\{0\}|} \le \beta_i \right\} \qquad (68)$$

(the subscript on $\alpha$ and $\beta$ denotes the fact that different
parameters may be needed for different components). Performing the
above operation should give the analyst some idea of where $C_0$
lies, and he can then use another sample of the unknown to collect
more data with simultaneous additions of several components, thus
allowing a more detailed exploration of the functional surface for
the regression, if it is needed. Adjustment of the parameters
will require some examination of the interference coefficients,
so the analyst may wish to run a rough preliminary GSAM to obtain
some idea of the magnitudes of the interferences.

The  reader  should  keep  in  mind  that  if  he  is  fortunate  enough 
to  have  a problem  which  requires  only  the  linear  model,  he  need 
not  concern  himself  with  the  iterative  process  or  with  finding 
some  interval  to  start  the  iteration.  Also,  the  problem  of  finding 
the  initial  starting  point  for  the  GSAM  is  the  same  problem  facing 
the  analyst  applying  the  simple  standard  addition  method. 


Dimensionality

Up until now little attention has been paid to the dimensions
or ranks of the various matrices. Yet, as the reader may already
suspect, this is of great importance. As was stated earlier for
the linear model, it is required that the $[\Delta C]$ matrix span an
r dimensional space, where r is the number of analytes. That this
is also a requirement for all higher degree models can be
intuitively grasped through the realization that r independent
$\Delta C_i$'s are needed to allow multiple linear regression to recover
all of the coefficients for each analyte; with fewer than r, at
least one of the analyte concentrations could be expressed as a
linear function of the others, and the regression for r analytes
would no longer be meaningful. Also stated for the linear model was
that p, the number of linearly independent analytical signals, had
to equal r. This extends to the model of arbitrary degree. If
p is less than r, then sufficient information cannot be obtained
to determine the initial concentrations of all analytes, i.e.,
F(x)=[0] is p equations in r unknowns, and so would have an
infinite number of solutions. If p is greater than r, then the
simultaneous equations cannot be solved for all p sensors (4).
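
Whether a planned set of additions spans the required r dimensional space can be checked before any measurements are made. The Python sketch below computes the rank of a hypothetical $[\Delta C]$ matrix (one row per MSA, one column per analyte) by Gaussian elimination; the helper `matrix_rank` and both example designs are assumptions for illustration only.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Largest remaining entry in this column is used as the pivot.
        pivot = max(range(rank, n_rows), key=lambda k: abs(m[k][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for k in range(rank + 1, n_rows):
            f = m[k][col] / m[rank][col]
            m[k] = [a - f * b for a, b in zip(m[k], m[rank])]
        rank += 1
    return rank

r = 3  # number of analytes

# Additions that vary the three analytes independently.
good_design = [[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1],
               [1, 1, 0]]

# Analyte 3 is always added in a fixed 2:1 proportion to analyte 1,
# so its change is a linear function of the change in analyte 1.
bad_design = [[1, 0, 2],
              [0, 1, 0],
              [2, 1, 4],
              [3, 0, 6]]

good_ok = matrix_rank(good_design) == r
bad_ok = matrix_rank(bad_design) == r
```

The second design has four additions but rank 2, so no amount of regression could separate the effects of analytes 1 and 3.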

Of course, one can select r of the p sensors and solve the
equation, then select a new set of r and solve again and so on,
finally comparing the results for each iteration, which should
be identical. This may be useful if the analytical sensors are
very non-linear, leading to several possible solutions for a given
iteration. In most cases, restricting solutions to have nonnegative
or, if negative, close to zero concentrations should be sufficient
to eliminate all solutions except one. If not, then by comparing
the sets of solutions obtained from each set of r out of p
information sources, the analyst may find only one solution that is
consistent in all sets, which hopefully is $C_0$. This is, however,
a worst case scenario, as the majority of problems will not behave
this badly; p=r sensors should be sufficient to provide the
solution $C_0$.


An important consequence of the above discussion surfaces
when there is a component in the sample that is not an analyte
but affects one or more of the responses, and if that component's
concentration is unknown so that it cannot be subtracted out as
background with the blank in eq. 61. Then, another sensor must
be included to make p=r+1 because the model must take into account
the interference of the component even though it is not of interest.
The advantage of the GSAM is that the response of this sensor need
only be a function of the concentrations of any of the components
in the sample that affect the other responses, without regard for
the number or combination of components involved. With this
freedom, the problem should be easily resolved in most cases. The
above condition can be expressed using a "binary dependence"
matrix, D, which is r' x p and whose elements are defined by

$$d_{ij} = \begin{cases} 1 & \text{if component } i \text{ affects response } j \\ 0 & \text{otherwise} \end{cases} \qquad (69)$$

where it is understood that none of the p analytical signals are
linearly dependent on the others. The GSAM requires only that
the rank of D be r', where r' is the total number of sample
components which affect the p responses. Naturally, if an analyst
is interested in only two analytes in a mixture of, say, ten, then
by using two sensors that are affected only by the two analytes
there is no need to worry about more sensors. The case corresponds
to a 2 x 2 diagonal submatrix of the matrix D required for analysis
of all ten components; note that D is required only to be square,
but as stated before, one can have p>r.


The number of MSA's the analyst must make to just have enough
information to determine the regression coefficients depends on
the degree of the model and the number of components under study.
Table I gives the formulas and some selected values for the linear,
quadratic, and cubic models; from the general derivation of the
models (keeping in mind the (n-1) dimensional triangularity of
the n dimensional $[{}_nG]$ coefficient matrix), the reader should see
in principle how to perform this calculation for any degree model.
Good practice in a regression analysis would require that the
number of data points taken exceed by a significant amount the
minimum required to just determine the system. As a rule of
thumb, try to collect at least twice as many data points as the
minimum required.
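
As a rough guide to the counting involved, the short Python sketch below tabulates the number of regression coefficients per sensor for a full polynomial model, together with the rule-of-thumb target of twice that number. It assumes that the minimum equals the number of distinct monomials of degree at most n in r concentration changes, which is the binomial coefficient C(r+n, n); this is a standard counting result offered here as an assumption, not a reproduction of the formulas in Table I.

```python
from math import comb

def min_data_points(r, degree):
    """Number of coefficients per sensor in a full polynomial model.

    Counts the intercept plus all distinct monomials up to the given
    degree in r variables (assumed equal to the minimum number of
    measurements needed to determine the regression).
    """
    return comb(r + degree, degree)

def recommended_data_points(r, degree, factor=2):
    """Rule of thumb from the text: at least twice the minimum."""
    return factor * min_data_points(r, degree)

# Tabulate for r = 2 and r = 3 analytes, linear through cubic models.
table = {(r, n): (min_data_points(r, n), recommended_data_points(r, n))
         for r in (2, 3) for n in (1, 2, 3)}
```

Under this assumption, a cubic model for three analytes already calls for 20 coefficients per sensor, and therefore about 40 data points by the rule of thumb.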


CONCLUSION
The method described in this paper is exactly what it purports
to be: the generalized standard addition method. It shares some
of the disadvantages of the traditional standard addition method
in that response must be zeroed and some additional effort to
transform the measured property values into suitable form relating
response to concentrations may be required. This transformation
may introduce so-called "constants", which are constant in theory,
but are often determined in practice by calibrating a given
analytical sensor (e.g., the term RT/nF in ion-selective electrodes).
If these constants must be used in the transformation, then
applying the GSAM requires a calibration of the measuring instrument,
which is not the usual case in the traditional standard addition
method.

The advantage of being able to simultaneously analyze for
different components in a mixture, while accounting for
interferences, certainly outweighs the relatively minor disadvantages.
The ability to use non-linear models allows the analyst to better
fit the data for solving for the initial concentrations of the
components than may be possible with a simple linear model; it
also allows the GSAM as presented here to be used whenever a
transformation can bring the data into a polynomial relationship
of some degree from a known theoretical relationship. The GSAM
even permits theoretical studies to be done on the relationship
between underlying independent and dependent variables.

There are a great many areas where the GSAM will be applicable,
especially where interferences have been of concern in the past.
Indeed, our laboratory is currently applying this method to